REMARKS 

This is in response to the Office Action mailed on June 28, 2007. In the Office 
Action, claims 3, 4, and 16 were objected to. Claims 12-16 were rejected under 35 U.S.C. § 101 
and claims 1-16 were rejected under 35 U.S.C. § 102(b). In response to this Office Action, claims 1, 3, 5, 8, 
and 12-16 have been amended. 

The Office Action indicates that claims 3, 4, and 16 are objected to. On page 2 
of the Office Action it was noted that "claim 3 depends on itself, it is believed it should depend 
on claim 2 and is treated as such." The Office Action further pointed out that claim 3 discussed 
the limitation "disconnected phonetic hypotheses," which has no proper antecedent basis in the 
specification. In light of the Office Action's remarks, the preamble of claim 3 has been amended 
to "The method of claim 2, ...". Claim 3 has also been amended by replacing the term 
"disconnected phonetic hypotheses" with "inconsistent phonetic hypotheses," which has proper 
antecedent basis on page 10 of the application. 

The Office Action states that claim 4 was objected to as "the limitation ranking 
the plurality of phonetic hypotheses identified" has no proper antecedent basis in the 
specification. Claim 4 does not refer to "ranking the plurality of phonetic hypotheses identified"; 
however, claim 5 does. It is assumed that the Office Action is referring to claim 5. In light of the 
Examiner's remarks, claim 5 has been amended to "comparing the plurality of phonetic 
hypothesis identified" (emphasis added). Comparing the plurality of phonetic hypothesis 
identified has proper antecedent basis on page 10 of the application. 

Claim 16 was also objected to as depending on claim 1. The Office Action 
indicated "it is believed it should depend on claim 12 and is treated as such." In light of the 
Examiner's remarks, claim 16 has been amended to now depend on claim 12. 



Rejection Under 35 U.S.C. § 101 



On pages 2-3 of the Office Action, claims 12-16 were rejected as directed towards 
non-statutory subject matter. On page 3, paragraph 3 of the Office Action, the Examiner 
indicated that amending the claims to recite "computer storage media" would overcome the 
rejection in a manner consistent with the Applicant's specification. In light of the Office 
Action's remarks, independent claim 12 has been amended to recite "a computer readable storage 
medium." Claims 13-16 have similarly been amended to reflect a computer readable storage 
medium. It is believed the term "storage medium" is sufficient to overcome the 35 U.S.C. § 101 
rejection. 

Rejection Under 35 U.S.C. § 102 

On page 3 of the Office Action, claims 1-16 were rejected as being anticipated by 
James et al. ("A Fast Lattice-Based Approach to Vocabulary Independent Wordspotting", 
hereinafter "James"). 

Claim 1 has been amended to recite: 

"A method of searching audio data, comprising: 

receiving a query comprising a grammar corresponding to pronunciation alternatives that 
define multiple phonetic possibilities for a segment of input speech; and 

comparing the query with a lattice of phonetic hypotheses associated with the audio data 
to identify if at least one of the multiple phonetic possibilities is approximated by 
at least one phonetic hypothesis in the lattice of phonetic hypotheses." (emphasis 
added). 

As pointed out in the application on page 12, paragraphs 1-2, the query can be a grammar 
corresponding to pronunciation alternatives that define multiple phonetic possibilities. In one 
embodiment, the grammar query can be represented as a weighted finite-state network. The 
grammar may also be represented by a context free grammar, unified language model, N-gram 
model and/or a prefix tree, for example. As shown in the application on page 12, paragraph 3, 
complex expressions such as telephone numbers and dates can be searched based on an input 
grammar defining these expressions. Alternative pronunciations can be searched within the 
database simultaneously as well, providing an advantage over other non-grammar based queries. 
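
For illustration only, the following minimal Python sketch shows how a grammar of pronunciation alternatives might be compared with a lattice of phonetic hypotheses. It is not taken from the application: the phone symbols, the lattice contents, and the names GRAMMAR, LATTICE, expand, and lattice_contains are hypothetical, and exact phone matching is assumed in place of the approximate matching recited in claim 1.

```python
from itertools import product

# Hypothetical grammar: one list of alternative pronunciations per slot.
GRAMMAR = [
    [("iy", "dh", "er"), ("ay", "dh", "er")],  # two pronunciations of "either"
    [("w", "ey"), ("w", "ah", "n")],           # "way" or "one"
]

# Hypothetical phone lattice: node -> list of (phone, next_node) edges.
LATTICE = {
    0: [("ay", 1), ("iy", 1)],
    1: [("dh", 2), ("th", 2)],
    2: [("er", 3)],
    3: [("w", 4)],
    4: [("ah", 5)],
    5: [("n", 6)],
    6: [],
}


def expand(grammar):
    """Enumerate every phone sequence the grammar accepts."""
    return [sum(choice, ()) for choice in product(*grammar)]


def lattice_contains(lattice, phones):
    """Return True if some path in the lattice spells exactly `phones`."""
    frontier = set(lattice)  # a match may start at any node
    for phone in phones:
        frontier = {
            nxt
            for node in frontier
            for (label, nxt) in lattice[node]
            if label == phone
        }
        if not frontier:
            return False
    return True


# All pronunciation alternatives are tested against the lattice in one pass.
matches = [seq for seq in expand(GRAMMAR) if lattice_contains(LATTICE, seq)]
```

In this sketch both pronunciations of the first slot are found, because the lattice carries both "iy" and "ay" as competing hypotheses, which is the kind of simultaneous search over alternatives that a single-pronunciation query would not perform.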

The Office Action cited James, page 1, column 2, paragraph 4, as disclosing 
receiving a query defining multiple phonetic possibilities, by using the term "keyword 
pronunciation." Applicants respectfully point out that nowhere do the cited sections of the James 
reference point towards a grammar-based system of receiving a query which defines 
multiple phonetic possibilities. Instead, James merely discloses evaluation on a single phone. 
As pointed out above, the grammar-based queries allow for many advantages which are not 
pointed out in the cited sections of James. 

Claim 8 was also rejected under 35 U.S.C. § 102(b) as being anticipated by James. 
Claim 8 has been amended to recite: 

"A method of generating a lattice from audio data, comprising: 

recognizing phonetic fragments within the audio data wherein at least some of the 
phonetic fragments include at least two phones; 

accessing a mutual information score for recognized phonetic fragments within the audio 
data that include at least two phones, wherein the mutual information score for 
each of the phonetic fragments having at least two phones is a function of a 
likelihood that phones in the phonetic fragment occur consecutively and a 
likelihood that each phone in the phonetic fragment occurs independent of other 
phones in the phonetic fragment; and 

determining a score for paths joining adjacent phonetic fragments in the audio data using 
in part the mutual information score for the phonetic fragments having at least two 
phones." (emphasis added) 

As pointed out on page 11, paragraph 3 of the application, the speech recognizer 
operates based upon a dictionary of phonetic word fragments. The fragments can be determined 
based on a calculation of mutual information of adjacent units, which may be phonemes or 
combinations of phonemes. The equation on page 11, line 10 indicates that mutual information can 
be a function of the likelihood that the phones in the phonetic fragment occur consecutively and a 
likelihood that each phone in the phonetic fragment occurs independent of other phones. Phonetic 
word fragments can be eliminated from a candidate list based upon mutual information. For 
instance, phonetic fragments that span word boundaries are eliminated from the list. By merging 
phones into fragments, the lattice size is reduced, allowing for more accurate and efficient 
searching of the lattice (application, page 10, paragraph 3). 
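
For illustration only, the following Python sketch computes one common form of such a score, pointwise mutual information, for adjacent phone pairs. The equation on page 11 of the application is not reproduced in these remarks, so the formula below is an assumption chosen to match the description above (a ratio of the likelihood that two phones occur consecutively to the product of the likelihoods that each occurs independently). The function name and the sample phone string are hypothetical.

```python
import math
from collections import Counter


def mutual_information(phones):
    """Score each adjacent phone pair (a, b) as log P(a, b) / (P(a) * P(b)).

    Assumed form only: probabilities are estimated from raw counts in
    `phones`, with no smoothing.
    """
    unigrams = Counter(phones)
    bigrams = Counter(zip(phones, phones[1:]))
    n_unigrams = len(phones)
    n_bigrams = len(phones) - 1

    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bigrams          # likelihood the phones occur consecutively
        p_a = unigrams[a] / n_unigrams    # likelihood of each phone on its own
        p_b = unigrams[b] / n_unigrams
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores


# Hypothetical training string: "s" is followed by "t" more often than "t" by "s".
scores = mutual_information(["s", "t", "s", "t", "a", "b"])
```

Under this assumed form, a pair that occurs together more often than chance predicts receives a higher score and is a stronger candidate for merging into a single multi-phone fragment.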

The Office Action on page 5 states that James "discloses a method of generating a 
lattice from audio data comprising recognizing phonetic fragments within the audio data, wherein 
at least some of the phonetic fragments include at least two phones." The Office Action cited 
page 1, column 2, paragraph 3 of James as using a modified Viterbi HMM-based phone 
recognizer. Applicants respectfully traverse this rejection in light of the following. The lattice in 
Figure 1, page 2 of James indicates that the lattice comprises a series of nodes. Each node is a 
single phone. Each node is linked to another node, which again comprises another single phone, 
to create a lattice. However, nowhere does this reference state that mutual information is 
used to determine phonetic fragments comprising more than one phone. As indicated, the nodes 
in James comprise a single phone. Furthermore, nowhere does the James reference cite using 
mutual information as described above. The phonetic fragments used to construct a lattice of the 
application can consist of more than one phone, unlike the lattice shown in Figure 1, page 2 of James. 
Nowhere does the Office Action show that the Viterbi HMM-based phone recognizer takes mutual 
information into account to recognize phonetic fragments comprising more than one phone. As a 
result, claim 8 is believed to be allowable over James. 

Claim 12 was also rejected under 35 U.S.C. § 102. Claim 
12 has been amended to recite: 

"A computer readable storage medium encoded with a data structure, comprising: 

a plurality of phoneme hypotheses and an associated score for each hypothesis, wherein at 
least some of the hypotheses form phonetic fragments that include at least two 
phones, and wherein the score for each phonetic fragment that includes at least 
two phones is a function of a likelihood that phones in the phonetic fragment 
occur consecutively and a likelihood that each phone in the phonetic fragment 
occurs independent of other phones in the phonetic fragment; and 

a plurality of transitions connecting the phoneme hypotheses." (emphasis added) 

As pointed out above, the citations to the current reference do not include determining phonetic 
fragments which can comprise more than one phone through mutual information. Thus, claim 12 
is believed to be allowable. 
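
For illustration only, a data structure of the kind recited in claim 12 might be sketched in Python as follows. The class and method names are hypothetical, the scores are placeholder values, and summing fragment scores along a path is an assumption made for the sketch rather than the scoring used in the application.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hypothesis:
    phones: Tuple[str, ...]  # one phone, or a multi-phone fragment
    score: float             # for multi-phone fragments, e.g. a mutual information score


@dataclass
class Lattice:
    hypotheses: List[Hypothesis] = field(default_factory=list)
    transitions: List[Tuple[int, int]] = field(default_factory=list)

    def add(self, phones, score):
        """Store a hypothesis and return its index."""
        self.hypotheses.append(Hypothesis(tuple(phones), score))
        return len(self.hypotheses) - 1

    def connect(self, src, dst):
        """Record a transition from hypothesis `src` to hypothesis `dst`."""
        self.transitions.append((src, dst))

    def path_score(self, path):
        """Sum hypothesis scores along a path of connected hypotheses."""
        for src, dst in zip(path, path[1:]):
            if (src, dst) not in self.transitions:
                raise ValueError(f"no transition from {src} to {dst}")
        return sum(self.hypotheses[i].score for i in path)


lattice = Lattice()
st = lattice.add(("s", "t"), 1.5)   # fragment with at least two phones
ah = lattice.add(("ah",), 0.25)     # single-phone hypothesis
lattice.connect(st, ah)
total = lattice.path_score([st, ah])
```

The point of the sketch is structural: a node may hold a fragment of two or more phones with its own score, in contrast to a lattice whose nodes each hold a single phone.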

Conclusion 

It is therefore respectfully submitted that claims 1-16 are in form for allowance. 
Reconsideration and allowance of the claims are respectfully requested. 
The Director is authorized to charge any fee deficiency required by this paper or 
credit any overpayment to Deposit Account No. 23-1123. 

Respectfully submitted, 